1. Overview of Digital Filter Design 
2. FIR Filter Design 
1. Linear Phase Filters 
2. Window Design Method 
3. Frequency Sampling Design Method for FIR filters 
4. Parks-McClellan FIR Filter Design 
5. Lagrange Interpolation 
3. IIR Filter Design 
1. Overview of IIR Filter Design 
. Prototype Analog Filter Design 
. IR Digital Filter Design via the Bilinear Transform 
. Impulse-Invariant Design 
. Digital-to-Digital Frequency Transformations 
. Prony's Method 
. Linear Prediction 


NOD OB WN 


Overview of Digital Filter Design 
Advantages of FIR filters 


1. Straight forward conceptually and simple to implement 

2. Can be implemented with fast convolution 

3. Always stable 

4. Relatively insensitive to quantization 

5. Can have linear phase (same time delay of all frequencies) 


Advantages of IIR filters 


1. Better for approximating analog systems 

2. For a given magnitude response specification, IIR filters often require 
much less computation than an equivalent FIR, particularly for narrow 
transition bands 


Both FIR and IIR filters are very important in applications. 
Generic Filter Design Procedure 


1. Choose a desired response, based on application requirements 
2. Choose a filter class 

3. Choose a quality measure 

4. Solve for the filter in class 2 optimizing criterion in 3 


Perspective on FIR filtering 


Most of the time, people do L°® optimal design, using the Parks-McClellan 
algorithm, This is probably the second most important technique in 
"classical" signal processing (after the Cooley-Tukey (radix-2) FFT). 


Most of the time, FIR filters are designed to have linear phase. The most 
important advantage of FIR filters over IIR filters is that they can have 
exactly linear phase. There are advanced design techniques for minimum- 
phase filters, constrained L? optimal designs, etc. (see chapter 8 of text). 
However, if only the magnitude of the response is important, IIR filers 
usually require much fewer operations and are typically used, so the bulk of 
FIR filter design work has concentrated on linear phase designs. 


Linear Phase Filters 


In general, for—-t@#<w<7 


Why is this important? A linear phase response gives the same time delay for ALL frequencies! (Remember the 
shift theorem.) This is very desirable in many applications, particularly when the appearance of the time-domain 
waveform is of interest, such as in an oscilloscope. (see [link]) 


linear phase filter non-linear phase filter 
x(t) y(t) 
LP-FIR IIR 


athe 


Restrictions on h(n) to get linear phase 
Equation: 


Hw) = xg d(nje) 
h(0) + h(1)e~) + h(2)e~ 4) +... + A(M = 1)e~(-D) 


M- M-1 M-1 


en (iw) (r(o)e™ 7 +... +h(M — 1)e~(*5*)) 


I 


= e (r) ((A(0) + h(M — 1)) cos(*=1w) + (A(1) + A(M — 2)) cos(*3w) +... +4 (A(0) sin| 


2 


For linear phase, we require the right side of [link] to be e~ 0) (real,positive function of w). For 09 = 0, we 
thus require 


h(0) + h(M — 1) = real number 
h(0) — h(M — 1) = pure imaginary number 
h(1) + h(M — 2) = pure real number 


h(1) — h(M — 2) = pure imaginary number 


Thus h(k) = h’(M — 1 — k) is anecessary condition for the right side of [link] to be real valued, for 9 = 0. 
For 09 = 4, or e (%) — 4, we require 
h(0) + h(M — 1) = pure imaginary 
h(0) — h(M — 1) = pure real number 


= h(k) = — (n*(M i k)) 


Usually, one is interested in filters with real-valued coefficients, or see [link] and [link]. 


Meven ; 


- M-1 


integer fraction 


2 2 


99 = 0 (Symmetric Filters). 
h(k) = h(M —1-k). 


Meven ; 


99 = + (Anti-Symmetric Filters). 
h(k) = —h(M —-1-k). 


Filter design techniques are usually slightly different for each of these four different filter types. We will study the 
most common case, symmetric-odd length, in detail, and often leave the others for homework or tests or for when 
one encounters them in practice. Even-symmetric filters are often used; the anti-symmetric filters are rarely used in 
practice, except for special classes of filters, like differentiators or Hilbert transformers, in which the desired 
response is anti-symmetric. 


So far, we have satisfied the condition that H(w) = A(w)e~ (#90) e— (ie) where A(w) is real-valued. However, 
we have not assured that A(w) is non-negative. In general, this makes the design techniques much more difficult, 
so most FIR filter design methods actually design filters with Generalized Linear Phase: 

H(w) = A(w)e*>), where A(w) is real-valued, but possible negative. A(w) is called the amplitude of the 
frequency response. 


Note: A(w) usually goes negative only in the stopband, and the stopband phase response is generally 
unimportant. 


lifx>0 


Note: |H(w)| = +(A(w)) = A(w)e~ ("2 (1-sien(A)))) where sign (x) = { Tet 
—l if ¢ 


Example: 
Lowpass Filter 
Desired |H(@)| 


4 
| 


—Tt —w. 


Desired ZH(q) 


Actual |H(o)| 


A(w) goes negative. 


Actual ZH(o) 


27 phase jumps due to 
periodicity of phase. 7 
phase jumps due to 
sign change in A(w). 


Time-delay introduces generalized linear phase. 


Note: For odd-length FIR filters, a linear-phase design procedure is equivalent to a zero-phase design procedure 
followed by an “— -sample delay of the impulse response, For even-length filters, the delay is non-integer, and 
the linear phase must be incorporated directly in the desired response! 


Window Design Method 


The truncate-and-delay design procedure is the simplest and most obvious FIR design procedure. 
Exercise: 


Problem: Is it any Good? 
Solution: 


Yes; in fact it's optimal! (in a certain sense) 


L2 optimization criterion 


find Vn,0 <n < M —1: (A[n]), maximizing the energy difference between the desired response and the actual 
response: i.e., find 


min {ln} f (1a(w) — #4(w))) aw} 


TT 


by Parseval's relationship 
Equation: 


min pin {ainl, J”. (\Ha(w) — H(w)|)? d w} 


I 


2 Ooo (|ealn] — hin)” 


= 2m (Soytce (lealn] — Afr)? + ONG! (Vhaln] — Afr)? + 
Since Math input error this becomes 
wT -1 M-— y lore) 
min jn] {ata | (|Ha(w) — H(w)|)? av} = SO (|halnjl)? + * +S (\haln] 
—T h=—oo 2, n=M 


Note: h[n] has no influence on the first and last sums. 


The best we can do is let 


nl lanes if0<n<M-1 


if else 


a -{' if 0 <n(M -1) 


if else 


is optimal in a least-total-sqaured-error ( 2, or energy) sense! 
Exercise: 


Problem: Why, then, is this design often considered undersirable? 


Solution: 


Gibbs Phenomenon 


A(w), small M A(w), large M 


_-~0.11 + 1.0 


™~_0.11 


For desired spectra with discontinuities, the least-square designs are poor in a minimax (worst-case, or D9) error 
sense. 


Window Design Method 


Apply a more gradual truncation to reduce "ringing" (Gibb's Phenomenon) 
Yn0<n<M-—t1h|n|=hg[n|w[n] : (n0<n<M-—1h[n|=ha|n|w{[n]) 


Note: H(w) = Ha(w)*W(w) 


The window design procedure (except for the boxcar window) is ad-hoc and not optimal in any usual sense. 
However, it is very simple, so it is sometimes used for "quick-and-dirty" designs of if the error criterion is itself 
heurisitic. 


Frequency Sampling Design Method for FIR filters 


Given a desired frequency response, the frequency sampling design method designs a filter 
with a frequency response exactly equal to the desired response at a particular set of 
frequencies wy. 
Equation: 

Procedure 


n=0 


M-1 
Vk, | [o, | eee .N = 1] : (21009 — S> Honjo) 


Note: Desired Response must incluce linear phase shift (if linear phase is desired) 


Exercise: 


Problem: What is H4(w) for an ideal lowpass filter, cotoff at w.? 


Solution: 


e ("5") if —We<w<u, 
0 if (-7<w< -w,) V (we <w<n7) 


Note: This set of linear equations can be written in matrix form 


Equation: 


Equation: 


Ha(wo) e (iw00) eo (wel) e- (iwo(M-1)) h(0) 
Ha(w1) e~ (iu10) eo (iwil) e- (wr (M-1)) h(1) 
Ha(wn-1) e (twu-10) — @—(twau-11) e~ (iwu-1(M—1)) h(M — 1) 
or 
Hy= Wh 
So 
Equation: 
h=W'Ha 


Note: W is a square matrix for N = M, and invertible as long as w; 4 w; + 2ml,i Fj 


Important Special Case 


What if the frequencies are equally spaced between 0 and 27, i.e. w, = 2ak +a 


Then 
= - Inkn . ee . - Inkn 
Ha(we) = S> A(nje eC) = Se (A(nje Ho Jel i) = DFT! 
n=0 n=0 
sO 
. 1 a - Iank 
h(n)e~ %”) iW 2 Ha(wz)e’ ™ 
or 
etan pte - 2ank . 
hin] = 5 S> Halwgle = ce" IDFT [Halwa] 
k=0 


Important Special Case #2 


h|n] symmetric, linear phase, and has real coefficients. Since h[n] = h|M — nl, there are 
only x degrees of freedom, and only a linear equations are required. 
Equation: 


Hwy] = cet h[nje~ sr) 
z hin] (e~ (iwxn) 4 e—(ur(M—n—1))) if M even 
= re hin] (e~ (iwen) 4 © SE) (n[ Je — (iw )) if M odd 


— (iene Not, " 3 h[n] cos(w;, (44+ — n)) if Meven 


e~ (wn ayia h[n| cos(w;, (“44+ —n)) + h[4+*] if M odd 
Removing linear phase from both sides yields 


2S na h(n] cos(w, (44+ —n)) if Meven 
Dak ra h[n| cos(w, (4 —n)) + A[ 44] if Modd 


A(wr) = 


Due to symmetry of response for real coefficients, only 4 w on w € [0, 77) need be 
specified, with the frequencies —w, thereby being implicitly defined also. Thus we have 


x real-valued simultaneous linear equations to solve for h|n]. 


Special Case 2a 


h|n] symmetric, odd length, linear phase, real coefficients, and w; equally spaced: 
Vk,O<k<M-1: (w= @) 
Equation: 

h{n] = IDFT [Ha(wrx)| 


= Dy A(wee Or) Mo eit 


= Ay) Ake Gr (ms) 


To yield real coefficients, A(w) mus be symmetric 
(A(w) = A(—w)) = (AlA] = A[M — k)) 


Equation: 


ial ( A(0) +35,%, AlK] (cit 4) “i eee) )) 
(A(0) +25,.% AIR] cos(2## (n — 454))) 


(A(0) +20,%) AlK](-1)* cos(2## (n + 4))) 


| 
a oe 


Simlar equations exist for even lengths, anti-symmetric, and a = - filter forms. 


Comments on frequency-sampled design 


This method is simple conceptually and very efficient for equally spaced samples, since 
h|n] can be computed using the IDFT. 


H(w) for a frequency sampled design goes exactly through the sample points, but it may 
be very far off from the desired response for w ~ wy. This is the main problem with 
frequency sampled design. 


Possible solution to this problem: specify more frequency samples than degrees of 
freedom, and minimize the total error in the frequency response at all of these samples. 


Extended frequency sample design 


For the samples H(w,) where0 <k < M-—1andWN > M, find h|n], where 
0<n< M —1 minimizing || Hy(w,) — H(w,) || 


For || / ||, norm, this becomes a linear programming problem (standard packages 
availble!) 


Here we will consider the || / ||, norm. 


To minimize the || 1 ||, norm; that is, )>”)' |Ha(we) — H(wx)|, we have an 
overdetermined set of linear equations: 


eo (iwo) =p (iwo( M1) Ha(wo) 
Ha(w1) 
. a; * h mn - 
—(iwy_10) —(iwn—1(M—1)) j 
e wee, E Ha(wn-1) 


or 


Wh= Ha 


— = Lis 
The minimum error norm solution is well known to be h = (ww) W Ag; 


_—_ = 
(ww) W is well known as the pseudo-inverse matrix. 


Note:Extended frequency sampled design discourages radical behavior of the frequency 
response between samples for sufficiently closely spaced samples. However, the actual 
frequency response may no longer pass exactly through any of the Hy(w,). 


Parks-McClellan FIR Filter Design 


The approximation tolerances for a filter are very often given in terms of the maximum, or 
worst-case, deviation within frequency bands. For example, we might wish a lowpass filter in a 
(16-bit) CD player to have no more than 5 bit deviation in the pass and stop bands. 


t= 2 (Fig) Hie FF lel ea 
Ha) =| ar = | (w)|= 517 lw] < wy 


sir > |H(w)| if ws < |w| <a 


The Parks-McClellan filter design method efficiently designs linear-phase FIR filters that are 
optimal in terms of worst-case (minimax) error. Typically, we would like to have the shortest- 
length filter achieving these specifications. Figure [link] illustrates the amplitude frequency 
response of such a filter. 


The black boxes on the left and right are the passbands, the 

black boxes in the middle represent the stop band, and the 

space between the boxes are the transition bands. Note that 
overshoots may be allowed in the transition bands. 


Exercise: 


Problem: Must there be a transition band? 


Solution: 


Yes, when the desired response is discontinuous. Since the frequency response of a finite- 
length filter must be continuous, without a transition band the worst-case error could be no 
less than half the discontinuity. 


Formal Statement of the L-o»o (Minimax) Design Problem 


For a given filter length (/) and type (odd length, symmetric, linear phase, for example), and a 
relative error weighting function W(w), find the filter coefficients minimizing the maximum 
error 


argminargmax |/(w)| =argmin || E(w) || ,, 
h weF h 


where 
E(w) = W(w) (Haw) — H()) 


and F' is a compact subset of w € [0, 7] (i.e., all w in the passbands and stop bands). 


Note: Typically, we would often rather specify || E(w) ||,, < 6 and minimize over M and h; 
however, the design techniques minimize 6 for a given M. One then repeats the design 
procedure for different M until the minimum MM satisfying the requirements is found. 


We will discuss in detail the design only of odd-length symmetric linear-phase FIR filters. 
Even-length and anti-symmetric linear phase FIR filters are essentially the same except for a 
slightly different implicit weighting function. For arbitrary phase, exactly optimal design 
procedures have only recently been developed (1990). 


Outline of L-co Filter Design 


The Parks-McClellan method adopts an indirect method for finding the minimax-optimal filter 
coefficients. 


1. Using results from Approximation Theory, simple conditions for determining whether a 
given filter is 1° (minimax) optimal are found. 

2. An iterative method for finding a filter which satisfies these conditions (and which is thus 
optimal) is developed. 


That is, the L filter design problem is actually solved indirectly. 


Conditions for L-0»o Optimality of a Linear-phase FIR Filter 


All conditions are based on Chebyshev's "Alternation Theorem," a mathematical fact from 
polynomial approximation theory. 


Alternation Theorem 


Let F' be a compact subset on the real axis x, and let P(x) be and Lth-order polynomial 


L 
P(e) = [> apa* 
k=0 


Also, let D(x) be a desired function of x that is continuous on F’, and W(z) a positive, 
continuous weighting function on F’. Define the error E(x) on F' as 


and 


|| E(x) ||, =argmax |E(z)| 
2eFk 


A necessary and sufficient condition that P(a) is the unique Lth-order polynomial minimizing 
|| E(x) ||,, is that E(x) exhibits at least L + 2 "alternations;" that is, there must exist at least 
D+ 2 values of x, 2, € F,k = [0,1,..., 4-1], such that ro < 2] <... < @p42 and such 
that (etx) = —E(ans1) = +(\| E len) 

Exercise: 


Problem: What does this have to do with linear-phase filter design? 
Solution: 


It's the same problem! To show that, consider an odd-length, symmetric linear phase filter. 
Equation: 


Hw) = SMG hnje 


= ew) (n(45*) LOS Ss n( —n) cos(wn) 
Equation: 
A(w) = hA(L) +2 De h(L — n) cos(wn) 


Where L = “=!, 


Using trigonometric identities (such as 
cos(na) = 2cos((n — 1)a) cos(a) — cos((n — 2)a)), we can rewrite A(w) as 


A(w) = hA(L) 4+ 2 >», h(L — n) cos(wn) ay a, cos (w) 


where the a, are related to the h(n) by a linear transformation. Now, let 2 = cos(w). This 
is a one-to-one mapping from z € [—1, 1] onto w € [0, z]. Thus A(w) is an Lth-order 
polynomial in z = cos(w)! 


Note:The alternation theorem holds for the °° filter design problem, too! 


Therefore, to determine whether or not a length-/, odd-length, symmetric linear-phase 
filter is optimal in an LD sense, simply count the alternations in 

E(w) = W(w) (Aa(w) — A(w)) in the pass and stop bands. If there are L + 2 = “43 or 
more alternations, h(n), 0 <n < M — 1 is the optimal filter! 


Optimality Conditions for Even-length Symmetric Linear-phase Filters 


For M even, 
L 
A(w) = A(L -_ 
(w) , ( )oos(w (n+ 5)) 
where L = _ — 1 Using the trigonometric identity 


cos(a + 8) = cos(a@ — 8) + 2cos(a) cos(f) to pull out the $ term and then using the other 
trig identities, it can be shown that A(w) can be written as 


A(w) = cos( = ) Ss arcost w) 


Again, this is a polynomial in « = cos(w), except for a weighting function out in front. 
Equation: 


E(w) = W(w) (Aa(w) — Al) 
= W(w) (Aa(w) — cos(#) P(w)) 


W(w) cos(#) (Ase - P(w)) 


vole 


which implies 
Equation: 


where 


and 


eal cos( +(cos(z))*) 


Again, this is a polynomial approximation problem, so the alternation theorem holds. If E(w) 
has at least D + 2 = ae + 1 alternations, the even-length symmetric filter is optimal in an L™ 
sense. 


The prototypical filter design problem: 
1 if jw) <w, 
MS | & if feel < lal 


See [link]. 


L-oo Optimal Lowpass Filter Design Lemma 


1. The maximum possible number of alternations for a lowpass filter is L + 3: The proof is 
that the extrema of a polynomial occur only where the derivative is zero: ae = 0. Since 
P'(a) is an (LZ — 1)th-order polynomial, it can have at most ZL — 1 zeros. However, the 
mapping x = cos(w) implies that oA) = 0 at w = 0 and w = 7, for two more possible 
alternation points. Finally, the band edges can also be alternations, for a total of 
DZ-1+2+2=L +3 possible alternations. 

2. There must be an alternation at either w = 0orw = 7. 

3. Alternations must occur at w, and ws. See [link]. 


4. The filter must be equiripple except at possibly w = 0 or w = 7. Again see [link]. 


Note:The alternation theorem doesn't directly suggest a method for computing the optimal 
filter. It simply tells us how to recognize that a filter is optimal, or isn't optimal. What we need 
is an intelligent way of guessing the optimal filter coefficients. 


In matrix form, these LZ + 2 simultaneous equations become 


1 cos(wo) cos(2wy) ... — cos(Lw) Wa} A 
. 220) sey a. ae h(L) a(wo) 
nl) | 
h(0) 
ee ° Anton 
1 cos(wz41) cos(2wz41) ... cos(Lwy+1) W(wisi) 
or 


v(i)-4 


So, for the given set of Z + 2 extremal frequencies, we can solve for h and 6 via 

(hd)? = W-1A,. Using the FFT, we can compute A(w) of h(n), on a dense set of 
frequencies. If the old w, are, in fact the extremal locations of A(w), then the alternation 
theorem is satisfied and h(n) is optimal. If not, repeat the process with the new extremal 
locations. 


Computational Cost 


O(L’) for the matrix inverse and N log, N for the FFT (NV > 32L, typically), per iteration! 
This method is expensive computationally due to the matrix inverse. 
A more efficient variation of this method was developed by Parks and McClellan (1972), and is 


based on the Remez exchange algorithm. To understand the Remez exchange algorithm, we 
first need to understand Lagrange Interpoloation. 


Now A(w) is an Lth-order polynomial in z = cos(w), so Lagrange interpolation can be used to 
exactly compute A(w) from ZL + 1 samples of A(w,), & = (0, 1, 2,..., L]. 


Thus, given a set of extremal frequencies and knowing 6, samples of the amplitude response 
A(w) can be computed directly from the 
Equation: 


without solving for the filter coefficients! 
This leads to computational savings! 


Note that [link] is a set of Z + 2 simultaneous equations, which can be solved for 6 to obtain 
(Rabiner, 1975) 


Equation: 
5- ico VkAa(wr) 
E+1 (1) x 
where 
7 L+1 1 
oo sith cos(w%,) — cos(w;) 


The result is the Parks-McClellan FIR filter design method, which is simply an application of 
the Remez exchange algorithm to the filter design problem. See [link]. 


Tnitial guess of L+2 
extremal fequencies 


Compute 6 using 


the equation given 


Using Lagrange interpolation 
compute dense set of samples 
(typically, 16*L) of Af) over the 
pass and stop bands 


Determine new L+2 
largest extrema 


es 
Compute h(n) 


Alternation Theorem satisfied? 


The initial guess of extremal frequencies is 
usually equally spaced in the band. 
Computing 6 costs O(L’). Using Lagrange 
interpolation costs O(16LL) ~ O(16L7). 
Computing h(n) costs O(L*), but it is only 


done once! 


The cost per iteration is O(16L7) , aS Opposed to O(L*); much more efficient for large L. Can 
also interpolate to DFT sample frequencies, take inverse FFT to get corresponding filter 
coefficients, and zeropad and take longer FFT to efficiently interpolate. 


Lagrange Interpolation 


Lagrange's interpolation method is a simple and clever way of finding the unique L 
th-order polynomial that exactly passes through L + 1 distinct samples of a signal. 
Once the polynomial is known, its value can easily be interpolated at any point 
using the polynomial equation. Lagrange interpolation is useful in many 
applications, including Parks-McClellan FIR Filter Design. 


Lagrange interpolation formula 


Given an Lth-order polynomial 


L 
P(x) =ap +a,z+...4 arr’ = ) apa” 
k=0 


and L + 1 values of P(a;) at different x,, k € {0,1,...,L}, 2; A #,,1 FJ, the 
polynomial can be written as 


_S* p(y) 2 BW) (@ = 82) (@ = ea) (= ess) (@ =) 
) 7 oor ) (xx - 1) (x, ~~ PO) cas (xx = Pei) (xp, _ Opi) ees (2% = xz) 


cs 


=0 
The value of this polynomial at other z can be computed via substitution into this 


formula, or by expanding this formula to determine the polynomial coefficients a, 
in standard form. 


Proof 


Note that for each term in the Lagrange interpolation formula above, 


= L— Lj te 


and that it is an Lth-order polynomial in x. The Lagrange interpolation formula is 
thus exactly equal to P(,) at all x,, and as a sum of Lth-order polynomials is 
itself an Lth-order polynomial. 


It can be shown that the Vandermonde matrix 


1 29 wo .. xo” Qo P(x) 


1 2 x12 sae x12 a1 P(x) 
1 45°. 45% es ng? ay — P(x) 
L. ay xp" ah ape ar, P(zz) 


has a non-zero determinant and is thus invertible, so the Lth-order polynomial 
passing through all L + 1 sample points z; is unique. Thus the Lagrange 
polynomial expressions, as an Lth-order polynomial passing through the D + 1 
sample points, must be the unique P(z). 


Overview of IIR Filter Design 


IIR Filter 


y(n) = — S) axy(n — k) +S) bye (n — k) 
k=1 k=0 


H(2) by + by2! + bez? bee + bye ™ 
Oe Se 
ltajyzttagz?+...+ayz-4 


IIR Filter Design Problem 


Choose {a;}, {b;} to best approximate some desired | Hq(w)| or, 
(occasionally), Hg(w). 


As before, different design techniques will be developed for different 
approximation criteria. 


Outline of ITR Filter Design Material 


¢ Bilinear Transform Maps || L ||, optimal (and other) analog filter 
designs to || L ||, optimal digital IIR filter designs. 

¢ Prony's Method Quasi-|| L ||, optimal method for time-domain 
fitting of a desired impulse response (ad hoc). 

¢ Lp Optimal Design || L ||,, optimal filter design (1 < p < oo) using 
non-linear optimization techniques. 


Comments on IIR Filter Design Methods 


The bilinear transform method is used to design "typical" || L || ,,, 
magnitude optimal filters. The || L || p Optimization procedures are used to 
design filters for which classical analog prototype solutions don't exist. The 
program by Deczky (DSP Programs Book, IEEE Press) is widely used. 
Prony/Linear Prediction techniques are used often to obtain initial guesses, 


and are almost exclusively used in data modeling, system identification, and 
most applications involving the fitting of real data (for example, the 
impulse response of an unknown filter). 


Prototype Analog Filter Design 


Analog Filter Design 


Laplace transform: 
HG / ha(t)e" dt 


Note that the continuous-time Fourier transform is H(A) (the Laplace transform 
evaluated on the imaginary axis). 


Since the early 1900's, there has been a lot of research on designing analog filters of the 
form 


bo + bis 4 bys? eee S| byrs™ 
se eae Tr, eT A 


tajs+ags?+...+ays 


A causal IIR filter cannot have linear phase (no possible symmetry point), and design 
work for analog filters has concentrated on designing filters with equiriplle (|| L || ,,) 
magnitude responses. These design problems have been solved. We will not concern 
ourselves here with the design of the analog prototype filters, only with how these 
designs are mapped to discrete-time while preserving optimality. 


An analog filter with real coefficients must have a magnitude response of the form 
2 
(|H(A)|)° = B(A’) 
Equation: 
HOH = by +byiA+by (id)? +b, (id)? +... H(id) 


1+ayiA+a2(édr)?+... 


bobo? +b4A*+...+4A(b1—b3A7 +b5A4 +...) bo —b2A?2+b4A44... 474A (by —b3A2+b5A4+...) 
1—a2d?+a4A44...4¢\(a1—a3A2+45\4+...) 1—a2d?+a4A44+...+iA(a1—a3A2+45A4+...) 
(by —ba 2+ b4d4 +...) +2 (b1—bg 245M +...) 
(1—apd?-+a4M4+...)?+A2(a1—a3\2+a5A4+4...)” 


B(?) 


Let s = 2A, note that the poles and zeros of B ( —s*) are symmetric around both the 


real and imaginary axes: that is, a pole at p; implies poles at p1, p1, —pi, and —pj, as 
seen in [link]. 


Recall that an analog filter is stable and causal if all the poles are in the left half-plane, 
LHP, and is minimum phase if all zeros and poles are in the LHP. 


s = id: B(A’) = B(-s’) = H(s)H(—s) = H (iA)H (— (iA)) = A (iA) (iA) we 
can factor B(—s*) into H(s)H(—s), where H(s) has the left half plane poles and 
zeros, and H(—s) has the RHP poles and zeros. 


(|H(s)|)* = H(s)H(-—s) for s = id, so H(s) has the magnitude response B(’). 
The trick to analog filter design is to design a good B (Y’), then factor this to obtain a 
filter with that magnitude response. 


The traditional analog filter designs all take the form B(A”) = (|H(A)|)? = 


1 
14+ F(A?) ’ 
where F is a rational function in X?. 


Example: 
Be) piss 
2 _ 2-8% | (v2—s) (v2+s) 
Be as pata (s +a) (s—a)(s+a)(s—a) 
where a@ = =" 


Note:Roots of 1+ s% are N points equally spaced around the unit circle ((link]). 


Take H(s) = LHP factors: 


—  v2+s —  va+s 
HONS cea) p CORSE 


Traditional Filter Designs 


Butterworth 
1 
BO |= se 
0") 1+ AM 


Note:Remember this for homework and rest problems! 


"Maximally smooth" at A = 0 and A = oo (maximum possible number of zero 
derivatives). [link]. 


B()”) = (|H(A)I)” 


Chebyshev 


1 
1 +462Cy?(X) 


B(d’) 


where C'yy?(X) is an M*" order Chebyshev polynomial. [link]. 


Inverse Chebyshev 


[link]. 


Elliptic Function Filter (Cauer Filter) 


1 
B(d?*) = 
) 1+ €?Jy7(A) 


where Jy is the "Jacobi Elliptic Function." [link]. 


The Cauer filter is || Z ||, optimum in the sense that for a given M, 5,, 65, and Xp, the 
transition bandwidth is smallest. 


That is, it is || Z ||, optimal. 


UR Digital Filter Design via the Bilinear Transform 


A bilinear transform maps an analog filter H,(s) to a discrete-time filter 
H(z) of the same order. 


If only we could somehow map these optimal analog filter designs to the 
digital world while preserving the magnitude response characteristics, we 
could make use of the already-existing body of knowledge concerning 
optimal analog filter design. 


Bilinear Transformation 


The Bilinear Transform is a nonlinear C —> C mapping that maps a 
function of the complex variable s to a function of a complex variable z. 
This map has the property that the LHP in s (9%(s) < 0) maps to the 
interior of the unit circle in z, and the iA = s axis maps to the unit circle e’ 
in z. 


e) 


Bilinear transform: 


amas | 
=a 
Za 
| 
WZ) =.) 32= 
(2) (: . eat ) 
ee w_y — _ (e*-1) (e~ ™)41) _  sinWw) _. 
Note: 0 = Cle = (41) (ey = 242 cos(w) = iatan($), so 


A= atan(*), y= 2 arctan( Z ). [link]. 


2 a 


The magnitude response doesn't change in the mapping from A to w, it is 
simply warped nonlinearly according to H(w) = Hy (a tan ( S ) ) , [link]. 


The first image implies the second one. 


|H,O)I 


Note:This mapping preserves || L ||,, errors in (warped) frequency bands. 
Thus optimal Cauer (|| Z ||.) filters in the analog realm can be mapped to 
|| Z ||,, optimal discrete-time IIR filters using the bilinear transform! This 
is how IIR filters with || L ||, optimal magnitude responses are designed. 


Note:The parameter a provides one degree of freedom which can be used 
to map a single Ag to any desired wo: 


AR= atan(>) 


Or 


Ao 


tan (<) 


a= 


This can be used, for example, to map the pass-band edge of a lowpass 
analog prototype filter to any desired pass-band edge in w. Often, analog 
prototype filters will be designed with A = 1 as a band edge, and a will be 
used to locate the band edge in w. Thus an M** order optimal lowpass 
analog filter prototype can be used to design any M* order discrete-time 
lowpass IIR filter with the same ripple specifications. 


Prewarping 


Given specifications on the frequency response of an IIR filter to be 
designed, map these to specifications in the analog frequency domain which 
are equivalent. Then a satisfactory analog prototype can be designed which, 
when transformed to discrete-time using the bilinear transformation, will 
meet the specifications. 


Example: 

The goal is to design a high-pass filter, ws = Ws, Wp = Wp, 05 = Os, Op = Op 
; pick up some @ = ap. In [link] the 6; remain the same and the band edges 
are mapped by A; = ap tan(+). 


Where A; = ay tan (+) and Ap = ag tan(=). 


[Ho | 


Impulse-Invariant Design 


Pre-classical, adhoc-but-easy method of converting an analog prototype filter 
to a digital IIR filter. Does not preserve any optimality. 


Impulse invariance means that digital filter impulse response exactly equals 
samples of the analog prototype impulse response: 


Vn : (h(n) = ha(nT)) 
How is this done? 


The impulse response of a causal, stable analog filter is simply a sum of 
decaying exponentials: 


bp + bis + bes? +...+ bps? Ay zt Ao A, 


1+ ais + ags? + ... + aps? S— 8 8 — 8» S—S8, 


H,(s) 
which implies 
ha(t) = (Aie*" -+- Aves" +... A,e*”’)u(t) 
For impulse invariance, we desire 
hin) hen) = (Aje2"™™ Ase ate. Ape" )u(n) 


Since 
Aze*?)"u(n) = 


where |z| > |e**”|, and 


where |z| >max; {k, |e? |}. 


This technique is used occasionally in digital simulations of analog filters. 


Exercise: 
Problem: 
What is the main problem/drawback with this design technique? 
Solution: 
Since it samples the non-bandlimited impulse response of the analog 
prototype filter, the frequency response aliases. This distorts the original 


analog frequency and destroys any optimal frequency properties in the 
resulting digital filter. 


Digital-to-Digital Frequency Transformations 


Given a prototype digital filter design, transformations similar to the 
bilinear transform can also be developed. 


Requirements on such a mapping gis g(z"): 


1. points inside the unit circle stay inside the unit circle (condition to 
preserve stability) 
2. unit circle is mapped to itself (preserves frequency response) 


This condition implies that e~ 1) = g(e~()) = |g(w)|e*<(9)) requires 
that | g(e™)) | = 1 on the unit circle! 


Thus we require an all-pass transformation: 


Pp —1 
ai)... z — Ak 
g(z ) “on 1 — azz-! 


where |ax| < 1, which is required to satisfy this condition. 


Example: 
Lowpass-to-Lowpass 


ai a = Gh 
ZI — 
il =a! 


which maps original filter with a cutoff at w, to a new filter with cutoff oe 


sin(+ (we — w,)) 


_ sin(> (we + w,)) 


Example: 
Lowpass-to-Highpass 


Sy BEG 


Ze DSO 
; 1+az! 


which maps original filter with a cutoff at w, to a frequency reversed filter 
with cutoff wi, 


cos ( = ee we) 


yy 
cos(+ (we + wt)) 


(Interesting and occasionally useful!) 


Prony's Method 
Prony's Method is a quasi-least-squares time-domain IIR filter design method. 


First, assume H(z) is an "all-pole" system: 
Equation: 


bo 


72 —— 
1+ Dp ane 


and 


M 
— So agh(n — k) + bd(n) 
k=1 


where h(n) = 0, n < 0 for a causal system. 
Note:For h = 0, h(0) = bo. 


Let's attempt to fit a desired impulse response (let it be causal, although one 
can extend this technique when it isn't) hg(n). 


A true least-squares solution would attempt to minimize 


(ee) 


=) (\ha(n) — h(n)I)° 


n=0 


where H(z) takes the form in [link]. This is a difficult non-linear 
optimization problem which is known to be plagued by local minima in the 
error surface. So instead of solving this difficult non-linear problem, we solve 
the deterministic linear prediction problem, which is related to, but not the 
same as, the true least-squares optimization. 


The deterministic linear prediction problem is a linear least-squares 
optimization, which is easy to solve, but it minimizes the prediction error, 


not the (|desired — actual|)* response error. 


Notice that for n > 0, with the all-pole filter 
Equation: 


M 
h(n) = — Ss axh(n — k) 
k=1 


the right hand side of this equation is a linear predictor of h(n) in terms of 
the M previous samples of h(n). 


For the desired reponse hg(n), one can choose the recursive filter coefficients 
a, to minimize the squared prediction error 
2 


23 


n=1 


M 
ha(n) + S— agha(n — k) 
k=1 


where, in practice, the oo is replaced by an NV. 


In matrix form, that's 


ha(0) 0 0 ‘ ha(1) 

ha(1) ha(0) 0 as ha(2) 

hg(N — 1) ha(N — 2) ha(N—M)} \am ha(N) 
Hya = —hg 


The optimal solution is 


ay = — (HaHa) 'Ha"tha) 


Now suppose H(z) is an M‘"-order IIR (ARMA) system, 


or 
Equation: 


h(n) 


M = 
H(z) __ k=0 bpz ‘ 
a3) eames = 
1+ Dd pa1 Oh? 


— Wxanr anh(n — k) + yan b;,6(n — k) 
~ ft agh(n — k) + dy if 0<n<M 
— yom azh(n—k) if n>M 


For n > M, this is just like the all-pole case, so we can solve for the best 
predictor coefficients as before: 


ha(M) ha(M-—1) ... ha(1) ay ha(M + 1) 
ha(M + 1) ha(M) ce ha(2) ao hg(M =F 2) 
ha(N =) ha(N — 2) . ha(N — M) a ha(N) 

Hua = ha 
and 


ae! aud 
Qopt = (ce Ha) Haha 


Having determined the a's, we can use them in [link] to obtain the 6,,'s: 


M 
bn = agha(n — k) 
k=1 


where ha(n — k) = 0 forn —k < 0. 


For N = 2M, H q is Square, and we can solve exactly for the a;'s with no 
error. The b;'s are also chosen such that there is no error in the first M+ 1 
samples of h(n). Thus for VN = 2M, the first 2M + 1 points of h(n) exactly 
equal hq(n). This is called Prony's Method. Baron de Prony invented this in 
1795, 


For N > 2M, ha(n) = h(n) for 0 < n < M, the prediction error is 
minimized for M+ 1 < n< N, and whatever for n > N + 1. This is called 
the Extended Prony Method. 


One might prefer a method which tries to minimize an overall error with the 
numerator coefficients, rather than just using them to exactly fit hg(0) to 


ha(M). 


Shank's Method 


1. Assume an all-pole model and fit hg(n) by minimizing the prediction 
errorl<n< WN. 

2. Compute u(7), the impulse response of this all-pole filter. 

3. Design an all-zero (MA, FIR) filter which fits v(n)*h,(n) ~ ha(n) 
optimally in a least-squares sense ((link]). 


The final IIR filter is the cascade of the all-pole and all-zero filter. 


This is is solved by 


N 
minp, bx, S- ( ha(n) — 


n=0 


or in matrix form 


v(0) 0 0 0 bo ha(0) 
v(1) v0) 0 0 b ha(1) 
v(2) —v(1) v(0) 0 bo | ~ | Ag(2) 
»(N) »(N — 1) o(N — 2) . u(N 7 M) i ha(N) 


Which has solution: 
bot = (VEV) ‘VER 


Notice that none of these methods solve the true least-squares problem: 


MiNgb feb (\ha(n )— rin? | 
n=0 


which is a difficult non-linear optimization problem. The true least-squares 
problem can be written as: 


oo 
ming, Qa,;5, p, S- ( 


n=0 


M 
ha(n) — ys a,e?” 


t=1 


since the impulse response of an IIR filter is a sum of exponentials, and non- 
linear optimization is then used to solve for the a; and (,. 


Linear Prediction 


Recall that for the all-pole design problem, we had the overdetermined set of linear equations: 


ha(0) 0 os 0 ay ha(1) 
ha(1) ha(0) 0 ay ha(2) 
ha(N =i} ha(N = 2) . ha(N — M) a ha(N) 


with solution a = (Ha"H.) Aha 


Let's look more closely at H eH d = R. rj; is related to the correlation of hg with itself: 
N-max{i,j} 
rg= >> halk)ha(k+ |é — Jl) 
k=0 


Note also that: 


Hatha _ ra(3) 


where 
N-i 
ra(t) = > ha(n)ha(n +7) 
n=0 
so this takes the form a@opt = — (R¥ ra), or Ra = —7r, where Ris M x M,ais M x 1,andrisalsoM x 1. 


Except for the changing endpoints of the sum, r;; ~ r(i — j) = r(j — 7). If we tweak the problem slightly to 
make rj; = r(t — J), we get: 


r(0) r(1) r(2) ... r(M-—1) a, >(1) 
r(1) (0) rQ) a r(2) 
r(2) (1) r(0) az = — 7(3) 
»(M —1) a . . »(0) aM r(M) 


The matrix # is Toeplitz (diagonal elements equal), and a can be solved for with O(M ) computations using 
Levinson's recursion. 


Statistical Linear Prediction 
Used very often for forecasting (e.g. stock market). 


Given a time-series y(n), assumed to be produced by an auto-regressive (AR) (all-pole) system: 


M 
y(n) = — > agy(n — k) + u(n) 
k=1 


where u(n) is a white Gaussian noise sequence which is stationary and has zero mean. 


To determine the model parameters {a;,} minimizing the variance of the prediction error, we seek 
Equation: 


ming, {an E|(y(n) + D2 annin —¥)) |b = min, (an, B[y(n) +258, axvlndu(n — 8) + Df 


= min, {ax, Ely?(n)] +2014, ax Ely(n)y(n — k)) +O 


Note:The mean of y(7) is zero. 


Equation: 


e? = r(0)+2(r(1) r(2) 7(3) r(M)) 9% + (a1 ag a3 am) r(2) r(1)_ (0) 
“a r(M — 1) 
Equation: 
2 
a = 2r+2Ra 
Oa 
Setting [link] equal to zero yields: Ra = —r These are called the Yule-Walker equations. In practice, given 
samples of a sequence y(7), we estimate r(n) as 
1 N=n 
r(n)= + Yo u(n)y(n + k) ~ Ely(k)y(n + k)| 
k=0 


which is extremely similar to the deterministic least-squares technique. 


